Dual coding in alternative reading frames correlates with intrinsic protein disorder.

Authors

  • Erika Kovacs
  • Peter Tompa
  • Karoly Liliom
  • Lajos Kalmar
Abstract

Numerous human genes display dual coding within alternatively spliced regions, which give rise to distinct protein products that include segments translated in more than one reading frame. To resolve the ensuing protein structural puzzle, we identified 67 human genes with alternative splice variants comprising a dual-coding region at least 75 nucleotides in length and analyzed the structural status of the protein segments they encode. The inspection of their amino acid composition and predictions by the IUPred and PONDR VSL2 algorithms suggest a high propensity for structural disorder in dual-coding regions. In the case of +1 frameshifts, the average level of disorder in the two frames is similarly high (47.2% in the ancestral frame, 58.2% in the derived frame, with the average level of disorder in human proteins being approximately 30%), whereas in the case of -1 frameshifts, there is a significant tendency to become more disordered upon shifting the frame (16.7% in the ancestral frame, 56.3% in the derived frame). The regions encoded by the derived frame are mostly disordered (disorder percentage > 50%) in 39 out of 62 cases, which strongly suggests that structural disorder enables these protein products to exist and function without the need for a highly evolved 3D fold. The potential advantages are also demonstrated by the appearance of novel functions and the high incidence of transcripts escaping nonsense-mediated decay. By discussing several examples, we demonstrate that dual coding may be an effective mechanism for the evolutionary appearance of novel intrinsically disordered regions with new functions.
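The two quantities the abstract relies on, the protein segment read from a shifted frame and the percentage of disordered residues, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names are hypothetical, the per-residue scores are assumed to come from a predictor such as IUPred run separately, the 0.5 score cutoff is an assumed convention, and treating a +1 frameshift as an offset of 1 nucleotide and a -1 frameshift as an offset of 2 within a fixed segment is also an assumption.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nucleotides, offset=0):
    """Translate a nucleotide segment, skipping `offset` leading bases.

    Assumption: offset 0 is the ancestral frame, offset 1 stands in for a
    +1 frameshift and offset 2 for a -1 frameshift. Incomplete trailing
    codons are dropped; unknown codons become 'X'; stops are shown as '*'.
    """
    seq = nucleotides.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def disorder_percent(scores, threshold=0.5):
    """Percentage of residues whose disorder score exceeds `threshold`.

    `scores` are hypothetical per-residue predictor outputs in [0, 1].
    """
    if not scores:
        raise ValueError("scores must not be empty")
    disordered = sum(1 for s in scores if s > threshold)
    return 100.0 * disordered / len(scores)


def is_mostly_disordered(scores, threshold=0.5):
    """Apply the abstract's criterion: disorder percentage > 50%."""
    return disorder_percent(scores, threshold) > 50.0
```

For example, `translate(segment, 0)` and `translate(segment, 1)` give the two protein segments encoded by the same dual-coding region, and `is_mostly_disordered` can then be applied to the predictor scores obtained for each of them.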

Similar resources

Rapid asymmetric evolution of a dual-coding tumor suppressor INK4a/ARF locus contradicts its function.

INK4a/ARF tumor suppressor locus encodes two protein products, INK4a and ARF, essential for controlling tumorigenesis and mutated in more than half of human cancers. There is no resemblance between the two proteins: their coding regions are assembled by alternative splicing of two mutually exclusive 5' exons into a constitutive one containing overlapping out-of-phase reading frames. We show tha...

Selection Pressure in Alternative Reading Frames

Overlapping genes are two protein-coding sequences sharing a significant part of the same DNA locus in different reading frames. Although an increasing number of examples have recently been found in bacteria, the underlying mechanisms of their evolution are unknown. In this work we explore how selective pressure in a protein-coding sequence influences its overlapping genes in alternative ...

A First Look at ARFome: Dual-Coding Genes in Mammalian Genomes

Coding of multiple proteins by overlapping reading frames is not a feature one would associate with eukaryotic genes. Indeed, codependency between codons of overlapping protein-coding regions imposes a unique set of evolutionary constraints, making it a costly arrangement. Yet in cases of tightly coexpressed interacting proteins, dual coding may be advantageous. Here we show that although dual ...

Birth of a unique enzyme from an alternative reading frame of the preexisted, internally repetitious coding sequence.

The mechanism of gene duplication as the means to acquire new genes with previously nonexistent functions is inherently self limiting in that the function possessed by a new protein, in reality, is but a mere variation of the preexisted theme. As the source of a truly unique protein, I suggest an unused open reading frame of the existing coding sequence. Only those coding sequences that started...

Effect of focal ischemia on long noncoding RNAs.

BACKGROUND AND PURPOSE: Long noncoding RNAs (lncRNAs) play a significant role in cellular physiology. We evaluated the effect of focal ischemia on the expression of 8314 lncRNAs in rat cerebral cortex using microarrays. METHODS: Ischemia was induced by transient middle cerebral artery occlusion. Genomic and transcriptomic correlates of the stroke-responsive lncRNAs and the transcription factor ...

Journal:
  • Proceedings of the National Academy of Sciences of the United States of America

Volume 107, Issue 12

Pages: -

Publication date: 2010